Pipeline data processing apparatus having small power consumption.

In a pipeline processing apparatus including a plurality of serially connected stages (ST1, ST2, ...), a
plurality of clock signals (CLK1, CLK2, ...) are supplied to the stages individually. The clock signals can be
individually stopped.

BACKGROUND OF THE INVENTION
Field of the Invention 

The present invention relates to a pipeline processing apparatus including a plurality of
serially-connected stages, each stage having a plurality of flip-flops and a logic gate combination circuit.

Description of the Related Art 

A microprocessor such as a pipeline processing apparatus includes a plurality of stages, each having a
plurality of flip-flops and a logic gate combination circuit. About half of the entire power consumption is
dissipated in the flip-flops. About half of that half is dissipated in the clock driving circuit that drives
the clock signal supplied to the flip-flops, and the remainder of that half is dissipated in the flip-flops
per se and their outputs.

Generally, in a complementary metal oxide semiconductor (CMOS) large scale integrated circuit (LSI),
power consumption is mainly dependent upon dynamic power consumption caused by charging and
discharging operations performed upon a load capacity, and can be represented by (see Neil Weste et al.,
"PRINCIPLES OF CMOS VLSI DESIGN", pp. 144-149, 1985)

P = C_L V_{DD}^2 f_p    (1)

where
P is the power consumption;
C_L is a load capacity;
V_{DD} is a power supply voltage; and
f_p is the frequency of a signal. If the signal is a clock signal whose frequency is f_c, then f_p = f_c.
If the signal is an output signal of a flip-flop, then f_p ≈ 1/4 f_c in view of the probability of transition
of the output signal from high to low and vice versa.
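As a hedged illustration of equation (1), the following Python sketch evaluates the dynamic power of a single load. The function name `dynamic_power` and the sample values (0.06 pF, 5 V, 50 MHz) are assumptions chosen for illustration only; the factor 1/4 for a flip-flop output follows the text above.

```python
def dynamic_power(c_load_farads, vdd_volts, f_signal_hz):
    """Dynamic CMOS power from equation (1): P = C_L * V_DD^2 * f_p."""
    return c_load_farads * vdd_volts ** 2 * f_signal_hz


# Illustrative values (assumptions made for this sketch).
C_LOAD = 0.06e-12   # load capacity of 0.06 pF, in farads
VDD = 5.0           # power supply voltage, in volts
F_CLOCK = 50e6      # clock frequency f_c, in hertz

# A clock node toggles at f_c, so f_p = f_c.
p_clock = dynamic_power(C_LOAD, VDD, F_CLOCK)

# A flip-flop output toggles at roughly f_c / 4.
p_output = dynamic_power(C_LOAD, VDD, F_CLOCK / 4)

print(f"clock node:  {p_clock * 1e6:.2f} uW")
print(f"data output: {p_output * 1e6:.2f} uW")
```

With the same load capacity, the clock node dissipates four times the power of the data output, which is why the clock network dominates in the analysis that follows.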

In the pipeline processing apparatus, however, the output signals of the flip-flops are not always
changed from high to low or vice versa in accordance with the clock signal. Each output signal of the
flip-flops may be changed once per ten clock cycles on the average, and in this case, f_p ≈ 1/10 f_c. This
means that about 90 % of the power consumption dissipated in the clock driver circuit can be wasted. This
will be explained later in detail.

SUMMARY OF THE INVENTION

It is an object of the present invention to reduce the power consumption of a pipeline processing 
apparatus including a plurality of serially-connected stages each having at least a plurality of flip-flops. 

According to the present invention, in a pipeline processing apparatus including a plurality of
serially-connected stages, a plurality of clock signals are supplied to the stages individually. The clock
signals can be individually stopped, so that wasteful transitions of the clock signals are reduced.

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention will be more clearly understood from the description as set forth below, in
comparison with the prior art, with reference to the accompanying drawings, wherein:

Fig. 1 is a circuit diagram illustrating a prior art pipeline processing apparatus;
Figs. 2A through 2F are timing diagrams showing the operation of the circuit of Fig. 1;
Figs. 3A and 3B are circuit diagrams illustrating examples of flip-flops;
Fig. 4 is a circuit diagram illustrating a first embodiment of the pipeline processing apparatus according
to the present invention;
Figs. 5A through 5K are timing diagrams showing the operation of the circuit of Fig. 4;
Fig. 6 is a circuit diagram illustrating a second embodiment of the pipeline processing apparatus
according to the present invention;
Figs. 7A through 7K are timing diagrams showing the operation of the circuit of Fig. 6; and
Fig. 8 is a circuit diagram illustrating a third embodiment of the pipeline processing apparatus according
to the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENT

Before the description of the preferred embodiment, a prior art pipeline processing apparatus will be 
explained with reference to Figs. 1, 2A through 2F, and 3A and 3B. 

In Fig. 1, which illustrates a prior art pipeline processing apparatus, a plurality of stages ST1, ST2, ...
are provided. The first stage ST1 is comprised of flip-flops FF11, FF12, ... for receiving data DA1, DA2,
..., respectively, and a logic gate combination circuit C1 for receiving the output data DB1, DB2, ... of
the flip-flops FF11, FF12, .... In this case, the flip-flops FF11, FF12, ... are of a D-type which operate in
response to the rising edges of their clock input. Note that the logic gate combination circuit C1 is
comprised of logic gates such as AND circuits, NAND circuits, OR circuits and NOR circuits, but includes no
flip-flop and no latch circuit. Also, the data DA1, DA2, ... are supplied to the first stage ST1 through
selectors SEL1, SEL2, ... which are controlled by a stall signal STL for stopping the pipeline operation of
the pipeline processing apparatus. That is, when the stall signal STL is low (= "0"), the data DA1, DA2, ...
are supplied to the flip-flops FF11, FF12, ..., so that the contents of the flip-flops FF11, FF12, ... are
changed. On the other hand, when the stall signal STL is high (= "1"), the output signals of the flip-flops
FF11, FF12, ... are fed back to the inputs thereof, so that the contents of the flip-flops FF11, FF12, ...
are not changed.

Also, the second stage ST2 is comprised of flip-flops FF21, FF22, ... and a logic gate combination
circuit C2 for receiving the output data DC1, DC2, ... of the flip-flops FF21, FF22, .... The third stage ST3
and its post-stages have the same configuration as the second stage ST2.

Further, a clock signal CLK is supplied to all the flip-flops FF11, FF12, ..., FF21, FF22, ..., FF31, FF32,
..., and therefore, these flip-flops are simultaneously operated.

The operation of the pipeline processing apparatus of Fig. 1 will now be explained with reference to 
Figs. 2A through 2F. 

As shown in Figs. 2A and 2B, the clock signal CLK and the data DA (DA1, DA2, ...) are always
generated. In this state, as shown in Fig. 2C, the stall signal STL is "0" at rise-edge timings t0, t1, t4 and
t5 of the clock signal CLK, so that the selectors SEL1, SEL2, ... select the data DA. As a result, the output
data DB (DB1, DB2, ...) of the flip-flops FF11, FF12, ... are data obtained by delaying the data DA by
one clock time period ΔT, as shown in Fig. 2D. On the other hand, as shown in Fig. 2C, the stall signal STL
is "1" at rise-edge timings t2, t3 and t6 of the clock signal CLK, so that the output data DB of the
flip-flops FF11, FF12, ... are not changed, as shown in Fig. 2D. Also, the second stage ST2 and its post
stages always receive the clock signal CLK, and therefore, the operation results of the logic gate
combination circuits C1, C2, ... based upon the outputs of their prestage flip-flops are written into the
flip-flops of the second stage ST2 and its post stages, as shown in Figs. 2E and 2F.

In the pipeline processing apparatus of Fig. 1, however, even during time periods where the contents of
the flip-flops are not changed due to the stall signal STL, the flip-flops receive the clock signal CLK and
are operated by it (see t2 and t3 of Fig. 2D, t3 and t4 of Fig. 2E, and t4 and t5 of Fig. 2F). This increases
the power consumption.

For example, if each of the flip-flops is of a static type as illustrated in Fig. 3A, the power consumption
of the pipeline processing apparatus of Fig. 1 is calculated below. Here, the following conditions are
assumed:
an input capacity of the clock signal CLK to each flip-flop = 0.06 pF;
an internal load capacity of each flip-flop = 0.07 pF;
an input capacity of the stall signal STL to each of the selectors SEL1, SEL2, ... = 0.04 pF;
an internal load capacity of each of the selectors SEL1, SEL2, ... = 0.05 pF;
the number of the flip-flops FF11, FF12, ... = 40;
the number of the flip-flops FF21, FF22, ... = 20;
the number of the flip-flops FF31, FF32, ... = 30;
an internal load capacity of the logic gate combination circuit C1 = 20 pF;
an internal load capacity of the logic gate combination circuit C2 = 10 pF;
V_{DD} = 5 V;
the frequency f_c of the clock signal CLK = 50 MHz;
the probability of "1" within the stall signal STL = 2/5, i.e., the frequency of the stall signal STL = 2/5
· 50 MHz; and
the frequency of other logic signals = f_s/4 (due to the fact that the probability of transition of the
output signal of each flip-flop at unstalled timings where the output signal is expected to change is 1/2,
i.e., the transition frequency is 1/4 f_s).
Therefore, from the equation (1),

P = (40 + 20 + 30) · 0.06 · 10^{-12} × 5^2 × 50 · 10^6
  + (40 + 20 + 30) · 0.07 · 10^{-12} × 5^2 × 1/4 · 2/5 · 50 · 10^6
  + 40 · 0.04 · 10^{-12} × 5^2 × 2/5 · 50 · 10^6
  + 40 · 0.05 · 10^{-12} × 5^2 × 1/4 · 2/5 · 50 · 10^6
  + (20 + 10) · 10^{-12} × 5^2 × 1/4 · 2/5 · 50 · 10^6    (2)

where the first term is a power consumption dissipated by the clock signal CLK; the second term is a
power consumption dissipated within the flip-flops; the third term is a power consumption dissipated by the
stall signal STL; the fourth term is a power consumption dissipated in the selectors SEL1, SEL2, ...; and
the fifth term is a power consumption dissipated in the logic gate combination circuits C1 and C2. The
equation (2) is represented by

P = (6.75 + 0.79 + 0.80 + 0.25 + 3.75) · 10^{-3}
  = 12.34 mW    (3)

Thus, 55 percent of the entire power consumption (12.34 mW) is dissipated by the clock signal CLK,
and 2/5 of this power consumption (22 percent of the entire power consumption) is dissipated when the
pipeline processing apparatus is stalled. Also, 9 percent (0.80 + 0.25 mW) of the entire power consumption
is dissipated in the selectors SEL1, SEL2, .... In other words, 31 percent of the entire power consumption
does not contribute to the pipeline operation of the pipeline processing apparatus, and therefore, the 31
percent of the entire power consumption is wasteful.
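The arithmetic of equations (2) and (3) can be checked with a short Python sketch. It is only a sketch under the conditions listed above; the helper `power` and the dictionary keys are hypothetical names introduced here, and the frequency factors are taken exactly as they appear in equation (2).

```python
VDD = 5.0                  # power supply voltage V_DD [V]
F_CLK = 50e6               # clock frequency f_c [Hz]
F_STALL = 2 / 5 * F_CLK    # frequency of the stall signal STL [Hz]
F_LOGIC = F_STALL / 4      # factor 1/4 * 2/5 * f_c used in equation (2) [Hz]
N_FF = 40 + 20 + 30        # flip-flops of the first, second and third stages
N_SEL = 40                 # factor 40 used for the selectors in equation (2)
PF = 1e-12                 # one picofarad in farads


def power(c_pf, f_hz):
    """Equation (1) for a capacity given in pF."""
    return c_pf * PF * VDD ** 2 * f_hz


terms_w = {
    "clock": power(N_FF * 0.06, F_CLK),
    "flip_flops": power(N_FF * 0.07, F_LOGIC),
    "stall": power(N_SEL * 0.04, F_STALL),
    "selectors": power(N_SEL * 0.05, F_LOGIC),
    "logic": power(20 + 10, F_LOGIC),
}
total_w = sum(terms_w.values())

clock_share = terms_w["clock"] / total_w
stalled_clock_share = 2 / 5 * clock_share
selector_share = (terms_w["stall"] + terms_w["selectors"]) / total_w

for name, value in terms_w.items():
    print(f"{name:10s} {value * 1e3:5.2f} mW")
print(f"total      {total_w * 1e3:5.2f} mW")
print(f"clock share {clock_share:.1%}, of which stalled {stalled_clock_share:.1%}")
print(f"selector share {selector_share:.1%}")
```

Because the text rounds each share before adding them, the unrounded sum of the stalled-clock and selector shares computed here can differ from the quoted figure by about one percentage point.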

On the other hand, if each of the flip-flops is of a dynamic type as illustrated in Fig. 3B, since the
number of transistors is reduced as compared with the static type flip-flop as illustrated in Fig. 3A, the
input capacity of the clock signal CLK to each flip-flop is decreased from 0.06 pF to 0.03 pF, and the
internal load capacity of each flip-flop is decreased from 0.07 pF to 0.04 pF. Therefore, the first term of
the equation (2) is replaced by

(40 + 20 + 30) · 0.03 · 10^{-12} × 5^2 × 50 · 10^6

and the second term of the equation (2) is replaced by

(40 + 20 + 30) · 0.04 · 10^{-12} × 5^2 × 1/4 · 2/5 · 50 · 10^6

Therefore, in this case, the entire power consumption P is represented by

P = (3.38 + 0.45 + 0.80 + 0.25 + 3.75) · 10^{-3}
  = 8.63 mW    (4)

Thus, 39 percent of the entire power consumption (8.63 mW) is dissipated by the clock signal CLK, and
2/5 of this power consumption (16 percent of the entire power consumption) is dissipated when the pipeline
processing apparatus is stalled. Also, 12 percent (0.80 + 0.25 mW) of the entire power consumption is
dissipated in the selectors SEL1, SEL2, .... In other words, 28 percent of the entire power consumption
does not contribute to the pipeline operation of the pipeline processing apparatus, and therefore, the 28
percent of the entire power consumption is wasteful.

In Fig. 4, which illustrates a first embodiment of the present invention using static type flip-flops as
illustrated in Fig. 3A, flip-flops FF10, FF20, ... and OR circuits G1, G2, G3, ... are added to the elements
of Fig. 1, and the selectors SEL1, SEL2, ... of Fig. 1 are deleted. Each of the flip-flops FF10, FF20, ...
delays the stall signals STL1 (= STL), STL2, ... by one clock time period ΔT. The OR circuits G1, G2, G3,
... turn ON and OFF the clock signal in accordance with the stall signals STL1, STL2, STL3, .... Thus,
the stall signals STL1, STL2, STL3, ... having one clock time period ΔT therebetween are generated. As a
result, the flip-flops FF11, FF12, ... are operated in accordance with a clock signal CLK1 which is an OR
logic between the clock signal CLK and the stall signal STL1, the flip-flops FF21, FF22, ... are operated in
accordance with a clock signal CLK2 which is an OR logic between the clock signal CLK and the stall signal
STL2, and the flip-flops FF31, FF32, ... are operated in accordance with a clock signal CLK3 which is an
OR logic between the clock signal CLK and the stall signal STL3.

The operation of the pipeline processing apparatus of Fig. 4 will now be explained with reference to
Figs. 5A through 5K.

As shown in Figs. 5A and 5B, the clock signal CLK and the data DA (DA1, DA2, ...) are also always
generated. In this state, as shown in Fig. 5C, the stall signal STL1 is "0" at rising-edge timings t0, t1, t4
and t5 of the clock signal CLK and is "1" at rising-edge timings t2, t3 and t6 of the clock signal CLK. As a
result, the clock signal CLK1 for the flip-flops FF11, FF12, ... rises at only timings t0, t1, t4 and t5, as
shown in Fig. 5D. Therefore, the data DB (DB1, DB2, ...) of the flip-flops FF11, FF12, ... are changed at
only timings t0, t1, t4 and t5, as shown in Fig. 5E, and therefore, the output data DB are obtained by
delaying the data DA by one clock time period ΔT.

Also, as shown in Fig. 5F, the stall signal STL2 is delayed as compared with the stall signal STL1 by
one clock time period ΔT. Therefore, as shown in Fig. 5F, the stall signal STL2 is "0" at rising-edge timings
t0, t1, t2, t5 and t6 of the clock signal CLK and is "1" at rising-edge timings t3 and t4 of the clock signal
CLK. As a result, the clock signal CLK2 for the flip-flops FF21, FF22, ... rises at only timings t0, t1, t2,
t5 and t6, as shown in Fig. 5G. Therefore, the data DC (DC1, DC2, ...) of the flip-flops FF21, FF22, ... are
changed at only timings t0, t1, t2, t5 and t6, as shown in Fig. 5H, and therefore, the output data DC are
obtained by delaying the data DB by one clock time period ΔT.

Also, as shown in Fig. 5I, the stall signal STL3 is delayed as compared with the stall signal STL2 by one
clock time period ΔT. Therefore, as shown in Fig. 5I, the stall signal STL3 is "0" at rising-edge timings t0,
t1, t2, t3 and t6 of the clock signal CLK and is "1" at rising-edge timings t4 and t5 of the clock signal CLK.
As a result, the clock signal CLK3 for the flip-flops FF31, FF32, ... rises at only timings t0, t1, t2, t3 and
t6, as shown in Fig. 5J. Therefore, the data DD (DD1, DD2, ...) of the flip-flops FF31, FF32, ... are changed
at only timings t0, t1, t2, t3 and t6, as shown in Fig. 5K, and therefore, the output data DD are obtained by
delaying the data DC by one clock time period ΔT.

Thus, according to the first embodiment, during a stall period where the change of the flip-flops FF11,
FF12, ..., FF21, FF22, ..., FF31, FF32, ... is unnecessary, the generation of the clock signals CLK1,
CLK2, CLK3, ... is stopped, thus reducing the power consumption.
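The per-stage clock gating of the first embodiment can be modelled at the level of clock edges. The following Python sketch is a simplified behavioural model, not a circuit simulation: it assumes that each stall flip-flop starts at "0", that a gated clock CLKn = CLK OR STLn has a rising edge at a timing only when STLn is "0" there, and it uses an illustrative stall pattern for STL1. The function names are hypothetical.

```python
def delay_one_period(stall, initial=0):
    """Model FF10, FF20, ...: the output at edge i is the input at edge i-1."""
    return [initial] + list(stall[:-1])


def rising_edge_timings(stall):
    """CLKn = CLK OR STLn is held high while STLn is 1, so CLKn rises
    only at the timings where STLn is 0."""
    return [i for i, s in enumerate(stall) if s == 0]


# Illustrative stall pattern for STL1, sampled at timings t0 .. t6.
STL1 = [0, 0, 1, 1, 0, 0, 1]
STL2 = delay_one_period(STL1)
STL3 = delay_one_period(STL2)

for name, stall in (("CLK1", STL1), ("CLK2", STL2), ("CLK3", STL3)):
    edges = ", ".join(f"t{i}" for i in rising_edge_timings(stall))
    print(f"{name} rises at: {edges}")
```

Each stage skips exactly the edges at which its own delayed stall signal is "1", so the stall propagates down the pipeline one stage per clock period while unaffected stages keep running.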
An actual power consumption of the pipeline processing apparatus of Fig. 4 will be explained as
compared with that of the pipeline processing apparatus of Fig. 1. The power consumption of the pipeline
processing apparatus of Fig. 4 is calculated below. Here, the following conditions are assumed:
an input capacity of the clock signal CLK to each flip-flop = 0.06 pF;
an internal load capacity of each flip-flop = 0.07 pF;
an input capacity of each of the OR circuits G1, G2, ... = 0.10 pF;
an internal load capacity of each of the OR circuits G1, G2, ... = 0.50 pF;
the number of the flip-flops FF11, FF12, ... = 40;
the number of the flip-flops FF21, FF22, ... = 20;
the number of the flip-flops FF31, FF32, ... = 30;
an internal load capacity of the logic gate combination circuit C1 = 20 pF;
an internal load capacity of the logic gate combination circuit C2 = 10 pF;
V_{DD} = 5 V;
the frequency f_c of the clock signal CLK = 50 MHz;
the probability of "1" within the stall signal STL = 2/5, i.e., the frequency of the stall signal STL = 2/5
· 50 MHz; and
the frequency of other logic signals = f_s/4.
Therefore, from the equation (1),

P = (2 · 0.06 + 3 · 0.10) · 10^{-12} × 5^2 × 50 · 10^6
  + (3 · 0.50 + (40 + 20 + 30) · 0.06) · 10^{-12} × 5^2 × 2/5 · 50 · 10^6
  + (40 + 20 + 30) · 0.07 · 10^{-12} × 5^2 × 1/4 · 2/5 · 50 · 10^6
  + (2 · 0.07 + 3 · 0.10) · 10^{-12} × 5^2 × 2/5 · 50 · 10^6
  + (20 + 10) · 10^{-12} × 5^2 × 1/4 · 2/5 · 50 · 10^6    (5)

where the first term is a power consumption dissipated by the clock signal CLK; the second term is a
power consumption dissipated in the OR circuits G1, G2 and G3 and by the clock signals CLK1, CLK2 and
CLK3; the third term is a power consumption dissipated within the flip-flops; the fourth term is a power
consumption dissipated by the stall signal STL; and the fifth term is a power consumption dissipated in the
logic gate combination circuits C1 and C2. The equation (5) is represented by

P = (0.53 + 3.45 + 0.79 + 0.22 + 3.75) · 10^{-3}
  = 8.74 mW    (6)

Thus, the entire power consumption (8.74 mW) in the first embodiment, validating all the rise edges of
the clock signals CLK1, CLK2 and CLK3, can be reduced by 29 percent as compared with that (12.34 mW)
of the prior art pipeline processing apparatus of Fig. 1. This is mainly because the power consumption by
the clock signals is reduced by 42 percent as compared with the prior art pipeline processing apparatus of
Fig. 1. Note that the reduction in power consumption by deleting the selectors SEL1, SEL2, ... and the
increase in power consumption by delaying the stall signal STL cancel each other.
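Equation (5) can be evaluated in the same way as the prior-art case. The Python sketch below is a hedged check under the listed conditions; the helper `power`, the dictionary keys and the constant `PRIOR_ART_W` (the 12.34 mW figure quoted for Fig. 1) are names introduced for this illustration, and the frequency factors are used exactly as they appear in equation (5).

```python
VDD = 5.0                  # power supply voltage V_DD [V]
F_CLK = 50e6               # clock frequency f_c [Hz]
F_STALL = 2 / 5 * F_CLK    # factor 2/5 * f_c used in equation (5) [Hz]
F_LOGIC = F_STALL / 4      # factor 1/4 * 2/5 * f_c used in equation (5) [Hz]
N_FF = 40 + 20 + 30        # flip-flops of the three stages
PF = 1e-12                 # one picofarad in farads
PRIOR_ART_W = 12.34e-3     # total quoted for the apparatus of Fig. 1 [W]


def power(c_pf, f_hz):
    """Equation (1) for a capacity given in pF."""
    return c_pf * PF * VDD ** 2 * f_hz


terms_w = {
    # CLK drives the two stall flip-flops and the three OR circuit inputs.
    "clk_common": power(2 * 0.06 + 3 * 0.10, F_CLK),
    # OR circuits G1..G3 plus the gated clocks CLK1..CLK3 into 90 flip-flops.
    "gated_clocks": power(3 * 0.50 + N_FF * 0.06, F_STALL),
    "flip_flops": power(N_FF * 0.07, F_LOGIC),
    "stall": power(2 * 0.07 + 3 * 0.10, F_STALL),
    "logic": power(20 + 10, F_LOGIC),
}
total_w = sum(terms_w.values())
reduction = 1 - total_w / PRIOR_ART_W

print(f"total power: {total_w * 1e3:.2f} mW")
print(f"reduction versus prior art: {reduction:.0%}")
```

The sketch sums unrounded terms, so its total can differ in the last digit from a sum of terms that were rounded individually.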

In Fig. 6, which illustrates a second embodiment of the present invention using dynamic type flip-flops
as illustrated in Fig. 3B, an inverter G11, an AND circuit G12, and selectors SEL1', SEL2', ... are added to
the elements of Fig. 4, to thereby carry out a refresh operation by a refresh signal REF. The refresh signal
REF is a clock pulse signal which is made high (= "1") for every definite time period such as several µs.

The operation of the pipeline processing apparatus of Fig. 6 will now be explained with reference to
Figs. 7A through 7K. When the refresh signal REF is "0", the operation of the pipeline processing
apparatus of Fig. 6 is the same as that of the pipeline processing apparatus of Fig. 4 (see non-refresh mode
of Figs. 7A through 7K). That is, the output signal of the inverter G11 is "1", so that the AND circuit G12
passes the stall signal STL therethrough. In this case, STL1 = STL. Simultaneously, the selectors SEL1',
SEL2', ... select the data DA1, DA2, ..., respectively.

When the stall signal STL1 (= STL) is "1" and accordingly, the stall signals STL2 and STL3 are "1",
the refresh signal REF is made "1", as shown in Fig. 7D, so that the control enters a refresh mode. That is,
the selectors SEL1', SEL2', ... select the outputs DB1, DB2, ... (= DB), respectively, of the flip-flops FF11,
FF12, .... Also, since the refresh signal REF (= "1") is inverted by the inverter G11 and is supplied to the
AND circuit G12, the output signal of the AND circuit G12 is "0" regardless of the stall signal STL. That is,
the stall signal STL1 is "0" as shown in Fig. 7E. As a result, the clock signal CLK passes through the OR
circuit G1 only for a time period defined by the stall signal STL1 (= "0"), so that the clock signal CLK1 is
formed by the clock signal CLK, as shown in Fig. 7F. Therefore, the outputs DB1, DB2, ... (= DB) of the
flip-flops FF11, FF12, ... are again written thereinto, thus refreshing the flip-flops FF11, FF12, ....

Also, the stall signal STL1 is latched by the flip-flop FF10, and as a result, the stall signal STL2 is "0"
for one clock time period ΔT, as shown in Fig. 7H. Therefore, the clock signal CLK passes through the OR
circuit G2 only for a time period defined by the stall signal STL2 (= "0"), so that the clock signal CLK2 is
formed by the clock signal CLK, as shown in Fig. 7I. Thus, the outputs DC1, DC2, ... (= DC) of the
flip-flops FF21, FF22, ... are again written thereinto, thus refreshing the flip-flops FF21, FF22, ....

Further, the stall signal STL2 is latched by the flip-flop FF20, and as a result, the stall signal STL3 is "0"
for one clock time period ΔT, as shown in Fig. 7K. Therefore, the clock signal CLK passes through the OR
circuit G3 only for a time period defined by the stall signal STL3 (= "0"), so that the clock signal CLK3 is
formed by the clock signal CLK, as shown in Fig. 7L. Thus, the outputs DD1, DD2, ... (= DD) of the
flip-flops FF31, FF32, ... are again written thereinto, thus refreshing the flip-flops FF31, FF32, ....
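The first-stage control logic of the refresh mode can be summarised as a small truth table. The Python sketch below is a behavioural model under the description above: STL1 is the AND of STL and the inverted REF, the selectors feed back DB when REF is "1", and the OR circuit G1 lets CLK through only while STL1 is "0". The function name and the tuple layout are hypothetical.

```python
def stage1_refresh_controls(stl, ref):
    """Return (STL1, feedback selected, CLK1 follows CLK) for 0/1 inputs."""
    stl1 = stl & (1 - ref)        # inverter G11 followed by AND circuit G12
    select_feedback = ref == 1    # selectors SEL1', SEL2', ... pick DB when REF = 1
    clock_passes = stl1 == 0      # OR circuit G1 passes CLK only when STL1 = 0
    return stl1, select_feedback, clock_passes


for stl in (0, 1):
    for ref in (0, 1):
        stl1, feedback, clocked = stage1_refresh_controls(stl, ref)
        source = "DB (feedback)" if feedback else "DA (new data)"
        print(f"STL={stl} REF={ref} -> STL1={stl1}, input={source}, clocked={clocked}")
```

In the stalled, non-refresh case (STL = 1, REF = 0) the clock is stopped; raising REF during a stall re-enables the clock for the first stage while the selectors rewrite the stored values, which is the refresh operation described above.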

Note that the provision of the selectors SEL1', SEL2', ... at the prestages of the flip-flops FF11, FF12,
... makes it definitely possible to carry out a refresh operation upon the flip-flops FF11, FF12, ... even
when the data DA (DA1, DA2, ...) are changed during a refresh mode. Contrary to this, during a stalling
period, the input values of the flip-flops FF21, FF22, ..., FF31, FF32, ... are not changed, and therefore, no
selectors are provided at the prestages of the flip-flops FF21, FF22, ..., FF31, FF32, ....

Thus, according to the second embodiment, during a stall period where the change of the flip-flops
FF11, FF12, ..., FF21, FF22, ..., FF31, FF32, ... is unnecessary, the generation of the clock signals
CLK1, CLK2, CLK3, ... is stopped, thus reducing the power consumption.

The actual power consumption of the pipeline processing apparatus of Fig. 6 will be explained as
compared with that of the pipeline processing apparatus of Fig. 1. The power consumption of the pipeline
processing apparatus of Fig. 6 is calculated below. Here, the following conditions are assumed:
an input capacity of the clock signal CLK to each flip-flop = 0.03 pF;
an internal load capacity of each flip-flop = 0.04 pF;
an input capacity of the refresh signal REF to each of the selectors SEL1', SEL2', ... = 0.05 pF;
an internal load capacity of each of the selectors SEL1', SEL2', ... = 0.05 pF;
an input capacity of each of the OR circuits G1, G2, ... = 0.10 pF;
an internal load capacity of each of the OR circuits G1, G2, ... = 0.50 pF;
the number of the flip-flops FF11, FF12, ... = 40;
the number of the flip-flops FF21, FF22, ... = 20;
the number of the flip-flops FF31, FF32, ... = 30;
an internal load capacity of the logic gate combination circuit C1 = 20 pF;
an internal load capacity of the logic gate combination circuit C2 = 10 pF;
V_{DD} = 5 V;
the frequency f_c of the clock signal CLK = 50 MHz;
the probability of "1" within the stall signal STL = 2/5, i.e., the frequency of the stall signal STL = 2/5
· 50 MHz; and
the frequency of other logic signals = f_s/4.
Therefore, from the equation (1),

P = (2 · 0.03 + 3 · 0.10) · 10^{-12} × 5^2 × 50 · 10^6
  + (3 · 0.50 + (40 + 20 + 30) · 0.03) · 10^{-12} × 5^2 × (2/5 + 1/200) · 50 · 10^6
  + (40 + 20 + 30) · 0.04 · 10^{-12} × 5^2 × 1/4 · 2/5 · 50 · 10^6
  + (2 · 0.04 + 3 · 0.10) · 10^{-12} × 5^2 × (2/5 + 1/200) · 50 · 10^6
  + (20 + 10) · 10^{-12} × 5^2 × 1/4 · 2/5 · 50 · 10^6
  + 40 · 0.04 · 10^{-12} × 5^2 × 1/200 · 50 · 10^6
  + 40 · 0.05 · 10^{-12} × 5^2 × 1/4 · (2/5 + 1/200) · 50 · 10^6    (7)

where the first term is a power consumption dissipated by the clock signal CLK; the second term is a
power consumption dissipated in the OR circuits G1, G2 and G3 and by the clock signals CLK1, CLK2 and
CLK3; the third term is a power consumption dissipated within the flip-flops; the fourth term is a power
consumption dissipated by the stall signal STL; the fifth term is a power consumption dissipated in the
logic gate combination circuits C1 and C2; and the sixth term is a power consumption dissipated in the
selectors SEL1', SEL2', .... The equation (7) is represented by

P = (0.45 + 1.44 + 0.45 + 0.19 + 3.75 + 0.01 + 0.25) · 10^{-3}
  = 6.54 mW    (8)

Thus, the entire power consumption (6.54 mW) in the second embodiment, validating all the rise edges
of the clock signals CLK1, CLK2 and CLK3, can be reduced by 24 percent as compared with that (8.63
mW) of the prior art pipeline processing apparatus of Fig. 1. This is mainly because the power consumption
by the clock signals is reduced by 42 percent as compared with the prior art pipeline processing apparatus
of Fig. 1.

In the above-mentioned second embodiment as illustrated in Fig. 6, the logic gate combination circuits
C1 and C2 can be of a dynamic type. In this case, the clock signals CLK2 and CLK3 are supplied to the
logic gate combination circuits C1 and C2, respectively, as indicated by dotted arrows in Fig. 6. That is,
when the clock signal CLK2 (CLK3) is high, the logic gate combination circuit C1 (C2) is precharged, while
when the clock signal CLK2 (CLK3) is low, the logic gate combination circuit C1 (C2) carries out a logic
operation. Thus, since the clock signal CLK2 (CLK3) masked by the stall signal STL2 (STL3) is supplied to
the logic gate combination circuit C1 (C2), the power consumption dissipated by the clock signals can be
reduced. In the prior art pipeline processing apparatus of Fig. 1, the clock signal CLK, which has more
transitions than the clock signals CLK2 and CLK3, may be supplied directly to the logic gate combination
circuits C1 and C2 which are of a dynamic type, thus increasing the power consumption.

In Fig. 8, which illustrates a third embodiment of the present invention, flip-flops FF11', FF12', ... and a
logic gate combination circuit C1' are connected in parallel to the flip-flops FF11, FF12, ... and the logic
gate combination circuit C1 of Fig. 4. In other words, the first stage consists of a double stage designated
by sub stages ST1 and ST1'. The sub stages ST1 and ST1' are switched by OR circuits G1 and G1' and
selectors SEL1" and SEL2" which are controlled by a decoding signal DEC.

In order to operate either the logic gate combination circuit C1 or the logic gate combination circuit C1',
either the clock signal CLK1 or the clock signal CLK1' is generated. For example, when the decoding signal
DEC is "1", the clock signal CLK1 is clocked, while when the decoding signal DEC is "0", the clock signal
CLK1' is clocked.
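The selection between the two sub-stage clocks can be sketched as follows. This Python model is only an assumption about how the OR circuits G1 and G1' might combine CLK with DEC (the stall input is left out for brevity, and the gate polarities are not stated in the text); it reproduces the stated behaviour that CLK1 is clocked when DEC is "1" and CLK1' is clocked when DEC is "0". The function name is hypothetical.

```python
def sub_stage_clocks(clk, dec):
    """Return (CLK1, CLK1') for 0/1 values of CLK and DEC.

    Assumed gating: a stopped clock is held high by its OR circuit.
    """
    clk1 = clk | (1 - dec)    # follows CLK when DEC = 1, held high when DEC = 0
    clk1_prime = clk | dec    # follows CLK when DEC = 0, held high when DEC = 1
    return clk1, clk1_prime


for dec in (1, 0):
    waveform = [sub_stage_clocks(clk, dec) for clk in (0, 1, 0, 1)]
    clk1 = [pair[0] for pair in waveform]
    clk1_prime = [pair[1] for pair in waveform]
    print(f"DEC={dec}: CLK1={clk1} CLK1'={clk1_prime}")
```

Only the selected sub stage sees clock transitions, so the flip-flops of the unselected sub stage do not dissipate clock power.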

In the third embodiment as illustrated in Fig. 8, if the logic gate combination circuit C1 forms an
arithmetic and logic unit (ALU), the logic gate combination circuit C1' forms a barrel shifter, and the logic
gate combination circuit C2 forms a data cache memory, the pipeline processing apparatus of Fig. 8 can
serve as a microprocessor.

Thus, in the third embodiment as illustrated in Fig. 8, even when there is a logic gate combination
circuit which is hardly operated, the surplus power consumption therefor can be reduced.

In Fig. 8, a stage other than the first stage can be double-staged. Also, a multi-stage greater than a
double-stage can be adopted instead of the double-stage.

As explained hereinbefore, according to the present invention, in a pipeline processing apparatus, since
power consumption dissipated by a clock signal can be reduced, an overall power consumption dissipated
in the pipeline processing apparatus can be reduced.


Claims 

1. A pipeline processing apparatus comprising: 

a plurality of serially-connected stages (STi , ST2, • • •) ; and 
75 a clock signal generating means, connected to said stages, for generating a plurality of clock 

signals (CLK1, CLK2,»»«) and transmitting them to said stages individually, each of the clock signals 
being for operating one of said stages. 

2. An apparatus as set forth in claim 1 , wherein said clock signal generating means receives a stall signal 
20 (STL) for stopping an operation of said apparatus to stop generation of said clock signals individually. 

3. An apparatus as set forth in claim 1 , wherein said clock generating means comprises: 

a plurality of stall signal generating means for generating a plurality of stall signals (STL1, STL2, 
• ••): and 

25 a plurality of gate circuits (G1, G2,*«*),each connected to one of said stall signal generating 

means, each of said gate circuits receiving a common clock signal (CLK) and one of said stall signals 
and passing the common clock signal therethrough in accordance with one of said stall signals. 

4. An apparatus as set forth in claim 3, wherein said plurality of stall signal generating means comprise a 
30 plurality of serially-connected delay circuits for receiving a main stall signal (STL) and delaying it to 

generate the plurality of stall signals. 

5. An apparatus as set forth in claim 3, wherein said plurality of stall signal generating means comprises a 
plurality of serially-connected flip-flops (FF10, FF20, •♦•) clocked by the common clock signal to 

35 generate the plurality of stall signals having a delay time period therebetween determined by the main 
clock signal. 

6. An apparatus as set forth in claim 1, wherein said clock signal generating means comprises: 

means for generating a plurality of stall signals (STL1, SYTL2,»«») having delay time periods in 
40 response to operations of said stages; and 

means for carrying out logic operations between the stall signals and a common clock signal (CLK) 
to generate the clock signals in accordance with results of the logic operations. 

7. An apparatus as set forth in claim 2, wherein said stages are of a dynamic type, 

45 said apparatus further comprising means for receiving a refresh signal (REF) to stop generation of 

the stall signal. 

a An apparatus as set forth in claim 3, wherein said stages are of a dynamic type, 

said apparatus further comprising means for receiving a refresh signal (REF) to stop generation of 
50 the stall signals. 

9. An apparatus as set forth in claim 1 , wherein each of said stages comprises: 

a plurality of flip-flops (FFu, FFi 2 ,-»«, FF21, FF 2 2,«", FF 3 i, FF 3 2,"*). each clocked by one of 
the clock signals; and 

55 a logic gate combination circuit (&,(*,•••) connected to outputs of said flip-flops. 

10. An apparatus as set forth in claim 9, wherein said logic gate combination circuit is of a dynamic type 
clocked by one of the clock signals. 
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11. An apparatus as set forth in claim 1, wherein at least one of said stages includes a plurality of 
parallelly-connected sub stages (STi , STi 

said apparatus further comprising a decoding means (Gi, GO for receiving a decoding signal 
(DEC) to select one of said sub stages. 

5 

12. A pipeline processing apparatus comprising:
    a plurality of serially-connected stages (ST1, ST2, ...), each including a plurality of first flip-flops
(FF11, FF12, ..., FF21, FF22, ..., FF31, FF32, ...), each clocked by one of the clock signals; and a
logic gate combination circuit (C1, C2, ...) connected to outputs of said flip-flops;
    a plurality of serially-connected second flip-flops (FF10, FF20, ...) for delaying a main stall signal
(STL1) to generate a plurality of stall signals (STL2, STL3, ...) having a delay time (ΔT) therebetween;
    a gate means (G1) for passing a common clock signal (CLK) therethrough in accordance with the
main clock signal and transmitting a passed common clock signal to said first flip-flops of a first one of
said stages; and
    a plurality of gate means (G2, G3, ...), each connected to one of said second flip-flops, each for
passing the common clock signal therethrough in accordance with one of the stall signals and
transmitting a passed common clock signal to said first flip-flops of one of said stages after the first
one.

13. An apparatus as set forth in claim 12, wherein said first flip-flops of said stages are of a dynamic type,
    said apparatus further comprising means for receiving a refresh signal (REF) to stop generation of
the main stall signal.

14. An apparatus as set forth in claim 13, further comprising selector means (SEL1 \ SEU\' • •) for feeding
back output signals of said first flip-flops of the first stage in response to the refresh signal.

15. An apparatus as set forth in claim 13, wherein said logic gate combination circuit is of a dynamic type,
    said logic gate combination circuit being connected to one of said gate means to receive the
passed main clock signal, so that said logic gate combination circuit carries out a precharging operation
and a logic operation alternately.

16. An apparatus as set forth in claim 12, wherein at least one of said stages includes a plurality of
parallelly-connected sub stages (ST1, ST1'),
    said apparatus further comprising a decoding means (G1, G1') for receiving a decoding signal
(DEC) to select one of said sub stages.
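
The clock-gating arrangement recited in claims 12 and 13 can be sketched as a small cycle-level model. The Python below is an illustration only, not the circuit of the specification. It assumes that each second flip-flop delays the stall signal by exactly one period of the common clock (so the delay time ΔT is one cycle), that a gate passes the common clock while its stall signal is low, that the delay flip-flops reset to 0, and that an asserted refresh signal simply masks the main stall signal. The function name `stage_clock_enables` and its parameters are hypothetical.

```python
def stage_clock_enables(main_stall, num_stages, refresh=None):
    """Return, for every clock cycle, a list of 0/1 clock-enable flags per stage.

    main_stall: 0/1 samples of the requested main stall signal, one per cycle.
    num_stages: number of serially-connected stages.
    refresh:    optional 0/1 samples of REF; when 1, the main stall signal
                is not generated in that cycle (assumed masking behaviour).
    """
    if refresh is None:
        refresh = [0] * len(main_stall)

    # Outputs of the serially-connected delay flip-flops: STL2, STL3, ...
    # Assumption: they start at 0 (no stall pending).
    delayed = [0] * (num_stages - 1)

    enables = []
    for stall, ref in zip(main_stall, refresh):
        # STL1 is generated only when a stall is requested and REF is low.
        stl1 = 1 if (stall and not ref) else 0
        stalls = [stl1] + delayed  # STL1, STL2, STL3, ...

        # Assumption: each gate passes the common clock when its stall is 0.
        enables.append([1 - s for s in stalls])

        # Clock edge: every delay flip-flop captures the preceding stall signal.
        delayed = stalls[:num_stages - 1]
    return enables


if __name__ == "__main__":
    for cycle, row in enumerate(stage_clock_enables([0, 1, 0, 0], 3)):
        print(cycle, row)
```

Under these assumptions, a one-cycle stall requested in cycle 1 stops the clock of the first stage in cycle 1, of the second stage in cycle 2, and of the third stage in cycle 3, so the stopped clock follows the stalled data down the pipeline. Asserting the refresh input in the same cycle suppresses the stall and every stage stays clocked.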
Fig. 3A (prior art)
Fig. 4



[Drawing sheets of EP 0 638 858 A1. Legible content only: Fig. 4 (labels CLK, STL1, ST3); Figs. 5A to 5K (signals CLK, STL1, CLK1, STL2, STL3, CLK3 plotted against times t0, t1, ...); Fig. 6 (label CLK). The remaining drawing content is not recoverable from this copy.]
European Patent Office

EUROPEAN SEARCH REPORT

Application Number: EP 94 11 2140

DOCUMENTS CONSIDERED TO BE RELEVANT

Each entry gives the category, the citation of the document with indication, where appropriate, of relevant passages, and the claims to which it is relevant.

X   P. M. KOGGE, 'The Architecture of Pipelined Computers', 1981, McGraw-Hill, New York, US.
    Relevant to claims: 1-4, 6, 9, 12

A   PROCEEDINGS OF THE FALL JOINT COMPUTER CONFERENCE 1965, pages 489 - 504,
    L. W. COTTEN, 'Circuit Implementation of High-Speed Pipeline Systems'
    * the whole document *
    Relevant to claims: 1, 5, 12

A   PATENT ABSTRACTS OF JAPAN, vol. 17, no. 522 (P-1616), 20 September 1993
    & JP-A-05 135 592 (NEC CORP.) 1 June 1993
    * abstract *
    Relevant to claims: 11, 16

A   PATENT ABSTRACTS OF JAPAN, vol. 9, no. 243 (P-392), 30 September 1985
    & JP-A-60 095 643 (FUJITSU K. K.) 29 May 1985
    * abstract *
    Relevant to claims: 11, 16

A   1992 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, 10 May 1992, CA, US, pages 208 - 211,
    D. H. K. HOE ET AL., 'Pipelining of GaAs Dynamic Logic Circuits'
    * the whole document *
    Relevant to claims: 7, 8, 13

A   DE-A-28 25 770 (LICENTIA PATENT-VERWALTUNGS-GMBH) 3 January 1980

Classification of the application (Int.Cl.6): G06F1/32, G06F9/38
Technical fields searched (Int.Cl.6): G06F

The present search report has been drawn up for all claims.

Place of search: THE HAGUE
Date of completion of the search: 11 November 1994
Examiner: Daskalakis, T

CATEGORY OF CITED DOCUMENTS

X : particularly relevant if taken alone
Y : particularly relevant if combined with another document of the same category
A : technological background
O : non-written disclosure
P : intermediate document
T : theory or principle underlying the invention
E : earlier patent document, but published on, or after the filing date
D : document cited in the application
L : document cited for other reasons
& : member of the same patent family, corresponding document
